Many high-profile societal problems involve an individual or group repeatedly attacking another - from 
child-parent disputes, sexual violence against women, civil unrest, violent conflicts and acts of terror, to 
current cyber-attacks on national infrastructure and ultrafast cyber-trades attacking stockholders. There is 
an urgent need to quantify the likely severity and timing of such future acts, shed light on likely perpetrators, 
and identify intervention strategies. Here we present a combined analysis of multiple datasets across all these 
domains which account for > 100,000 events, and show that a simple mathematical law can benchmark them 
all. We derive this benchmark and interpret it, using a minimal mechanistic model grounded by 
state-of-the-art fieldwork. Our findings provide quantitative predictions concerning future attacks; a tool to 
help detect common perpetrators and abnormal behaviors; insight into the trajectory of a 'lone wolf'; 
identification of a critical threshold for spreading a message or idea among perpetrators; an intervention 
strategy to erode the most lethal clusters; and more broadly, a quantitative starting point for 
cross-disciplinary theorizing about human aggression at the individual and group level, in both real and 
online worlds. 



Human confrontations 1-25, from one-on-one fights 3-5 through to collective protests 6-10, mass violence 11-23 
and even online acts of aggression 24-25, are of great societal importance. However, our understanding of the 
dynamics at the event-by-event level remains limited (e.g. a child's repeated cry-attacks against a 
parent 3-5), where each side ('Red' and 'Blue') is engaged in a complex cat-and-mouse game of adaptation and 
counter-adaptation, and where agility and secrecy (e.g. of Red) can enhance the ability to launch attacks 8,9,20-23 (e.g. against 
Blue). While 'big data' approaches to non-confrontational human activities have revealed new patterns 26-33, the 
presence of aggression and danger means that event records for a particular confrontation run the risk of being 
incomplete or biased 15 . These considerations motivate us to analyze a broad spectrum of heterogeneous event- 
level datasets drawn from independent sources across multiple disciplines, not limited to armed conflict 16-19 , 
crossing from local to global geographic scales and in both real and online worlds. Our data sources are listed in 
the Supplementary Information (SI). We find that in all these systems, the distribution of the severity of events 
and the trend in the timing of events, are each described by a power-law function of the form $AB^{-C}$. 

Results 

Each point in Fig. 1 results from the maximum likelihood fitting of the power-law $Ms^{-\alpha}$ to the tail in the 
distribution of the severity of individual events within a given confrontation, where s is the severity of an 
individual event which, in the case of violent conflict, is the number killed or injured in an attack, $\alpha$ is the 
power-law exponent, M is the normalizing factor, and p is the goodness-of-fit 16-18. Figure 1A inset illustrates 
this power-law tail distribution, while full details of the statistical fitting procedure are described in the cited 
references of the Methods section. We can analyze event severities and timings separately since they show no 
systematic cross-correlation, as illustrated in the SI. Specifically, while the event severity distribution is stationary 
Figure 1 | Event-severity benchmark across geographic scales and domains. Each data-point shows (p, $\alpha$) values for event severity distribution $Ms^{-\alpha}$ 
(Fig. 1A inset) for confrontations (A) within a given continent (Africa); (B) across the globe, for different actors and different injury levels; (C) within a 
given country (departments in Colombia). (D) shows conventional wars and sexual violence against women 5. Suicides etc. form a near continuum at p = 
0 with $\alpha \gg 2.5$. The darker the color of each data-point, the larger the total number of victims (see SI). Red star shows value for global terrorism 17, green 
ring is value for entire Africa database, purple ring is value for all interstate wars from 1860-1980. Dashed horizontal line shows theoretical benchmark $\alpha$ 
= 2.5 derived from the simple version of our theory, as described in the text; SI shows $\alpha$ = 2.5 result is robust to generalizations. Red shaded area 
corresponds to goodness-of-fit p < 0.05. Inset in Fig. 1D shows empirically determined Red operational network for PIRA in South Armagh 20. Fig. 1D lists 
other empirically determined $\alpha$ values. Domains are omitted in Figs. 1-3 if we lacked the necessary data (see SI). 



throughout the confrontation to a good approximation, the timing of 
individual events is a non-stationary process with periods of initial 
escalation or de-escalation. 
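
As a rough illustration of the severity fit described above, the following minimal sketch estimates $\alpha$ for the tail of a severity distribution using the standard continuous-approximation maximum-likelihood estimator. It is not the full procedure of Refs. 16-18: the cut-off `s_min` is assumed to be given rather than estimated, no goodness-of-fit p value is computed, and the function names and synthetic data are illustrative assumptions.

```python
import math
import random


def fit_power_law_tail(severities, s_min):
    """Continuous-approximation MLE of alpha for p(s) ~ s^(-alpha), s >= s_min.

    Assumption: s_min is supplied by the caller; the cited procedure also
    selects s_min and tests goodness-of-fit, which this sketch omits.
    """
    tail = [s for s in severities if s >= s_min]
    if len(tail) < 2:
        raise ValueError("need at least two events in the tail")
    log_sum = sum(math.log(s / s_min) for s in tail)
    alpha = 1.0 + len(tail) / log_sum
    std_err = (alpha - 1.0) / math.sqrt(len(tail))  # asymptotic standard error
    return alpha, std_err


def sample_power_law(alpha, s_min, size, rng):
    """Draw synthetic severities from a continuous power law (inverse transform)."""
    return [s_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(size)]


# Synthetic check: data generated with alpha = 2.5 should be recovered closely.
rng = random.Random(0)
synthetic = sample_power_law(alpha=2.5, s_min=1.0, size=20000, rng=rng)
alpha_hat, err = fit_power_law_tail(synthetic, s_min=1.0)
print(f"estimated alpha = {alpha_hat:.2f} +/- {err:.2f}")
```

Real casualty counts are integers, so a discrete estimator may be more appropriate for small `s_min`; the continuous form is used here only to keep the sketch short.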

Our results for the timing of individual events are summarized by 
Figs. 2 and 3, where each point results from the maximum-likelihood 
fit of a power-law $\tau_1 n^{-\beta}$ to the trend in successive inter-event time 
intervals $\tau_n$ between the n'th and the (n + 1)'th event within a given 
confrontation, with n = 1, 2, 3 etc. and with $\tau_1$ being the intercept on 
a log-log plot of $\tau_n$ vs. n 19. This procedure is illustrated in Fig. 3 upper 
inset. The residuals in each least-squares fit are approximately 
Gaussian-distributed and i.i.d., as required for the maximum 
likelihood best-fit (see SI). The event timing results in Figs. 2 and 3 
mostly show escalation ($\beta > 0$) with some de-escalation ($\beta < 0$). 
We do not address how, why or when each confrontation ends (or 
begins) but instead focus on the non-stationary behavior leading up 
to this endgame. 
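
The timing fit described above can be sketched as a straight-line fit on a log-log plot of inter-event interval against event index. The snippet below is a minimal illustration, assuming base-10 logarithms and an ordinary least-squares fit via NumPy; the function name and the synthetic event times are hypothetical and not taken from the paper's data.

```python
import numpy as np


def fit_escalation(event_times):
    """Fit tau_n = tau_1 * n**(-beta) to successive inter-event intervals.

    Performs a least-squares line fit of log10(tau_n) against log10(n).
    Returns (tau_1, beta); beta > 0 indicates escalation (shrinking intervals).
    """
    times = np.sort(np.asarray(event_times, dtype=float))
    intervals = np.diff(times)  # tau_n for n = 1, 2, ...
    if len(intervals) < 2 or np.any(intervals <= 0):
        raise ValueError("need at least three events with distinct times")
    n = np.arange(1, len(intervals) + 1, dtype=float)
    slope, intercept = np.polyfit(np.log10(n), np.log10(intervals), 1)
    return 10.0 ** intercept, -slope


# Synthetic, noise-free example with illustrative parameter values.
true_tau_1, true_beta = 100.0, 0.7
n = np.arange(1, 31, dtype=float)
event_times = np.concatenate(([0.0], np.cumsum(true_tau_1 * n ** (-true_beta))))
tau_1_hat, beta_hat = fit_escalation(event_times)
print(f"tau_1 = {tau_1_hat:.1f}, beta = {beta_hat:.2f}")
```

With Gaussian, i.i.d. residuals in log space, this least-squares fit coincides with the maximum-likelihood fit, which is the condition the text refers to.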

For the analysis of event severities in Figs. 1A-1C, the $\alpha$ exponent 
values for the power-law severity distribution are broadly bunched 
around 2.5 with statistically significant goodness-of-fit, i.e. p > 0.05. 
(See the cited references in Methods for details on how to determine 
and interpret these p values). For the analysis of the trend in the 



timing of events (Figs. 2-3), the $\beta$ parameter governing the trend 
from the outset of the confrontation shows an approximate linear 
dependence on log $\tau_1$, implying that within each domain the 
confrontations in which initial events are frequent tend to slow down 
over time ($\beta$ negative when $\tau_1$ is small) whereas they accelerate if 
events started slowly ($\beta$ large positive when $\tau_1$ is large). In Fig. 2A for 
infant attacks, this is particularly remarkable since each point corre- 
sponds to a different infant (and parent), and the experiment under- 
lying each point is performed at separate times. A random process in 
which individuals become victims independently with a constant 
probability, would have yielded p = 0 with an arbitrary $\alpha$ value in 
Fig. 1, thereby explaining the values in Fig. 1D for suicides, 
homicides, and death by accident and disease, while for timings the $\beta$ 
values would have been evenly scattered around $\beta = 0$ with no strong 
linear dependence. (See SI for empirical confirmation of these state- 
ments). Our results benefit from out-of-sample testing: Ref. 34 pro- 
vides a public, time-stamped record of our 2005 pilot study of two 
wars that hinted at a severity distribution $Ms^{-\alpha}$ with $\alpha \approx 2.5$, while 
Ref. 16 adds seven more and terrorism. Likewise Ref. 19 records our 
2011 pilot study of two wars suggesting $\tau_1 n^{-\beta}$ for event timings with 
$\beta$ linearly dependent on log $\tau_1$. Here we move beyond wars and 
Figure 2 | Event-timing benchmark across domains. (A) Each point denotes a unique infant-parent pair, obtained using analysis in Fig. 3 upper inset. 
Underlying events are cry-face attacks by infant (Red) against parent (Blue). The experiment is described in Ref. 4. (B) Each point denotes a unique 
geographic location. Underlying events are street protests by anti-government groups (Red) against Polish government (Blue). (C) Each point denotes a 
unique sector of national cyber-infrastructure. Underlying events are cyber-attacks by foreign group (Red) against indicated sector's defenses (Blue). (D) 
Each point denotes a particular U.S. financial institution stock. Underlying events are attacks by ultrafast predatory traders (Red) against the remaining 
market of slower global investors (Blue). We can reject the null hypothesis that these linear fits emerge by chance, by randomizing event times and then 
comparing probability distribution of $R^2$ fits to the real value $R^2_{real}$ in order to generate p significance values (Fig. 2A, $R^2_{real}$ = 0.74, p = 0.0089; Fig. 2B, 
$R^2_{real}$ = 0.82, p = $5.6 \times 10^{-5}$; Fig. 2C, $R^2_{real}$ = 0.91, p = om6; Fig. 2D, $R^2_{real}$ = 0.80, p = 0.0087). See SI for more details. 



terrorism with Figs. 1-3 providing blind test results for every dataset 
made available to us in the interim years. 

Discussion 

The confrontations that follow the benchmark behavior generally 
feature an actor (Red, e.g. cyber-hackers, insurgents, terrorists, pro- 
testors, ultrafast traders, infant) who is in principle weaker than its 
Blue opposition (respectively, the national infrastructure, incumbent 
army, security forces, ruling government, global stock holders, par- 
ent), yet who manages to inflict a series of attacks that typically 
escalates ($\beta > 0$ in Figs. 2-3). We develop our explanatory model 
by referring to the most recent and detailed fieldwork available of 
such a Red group 20 : PIRA (the Provisional IRA) who inflicted an 
escalating number of attacks against the stronger British government 
forces (Blue) in Northern Ireland from 1969 onwards 20 . PIRA's 
operational network, shown in Fig. 1D inset, has a decentralized 
structure consistent with jihadist operational networks 9,21,23 and with 
other clandestine and illicit groups, e.g. online gold farmers 35 . Its 
resources - which in Fig. 1D inset are people but for more general 
Red may include technology, predatory algorithms (Figs. 2C-D) or 



even abstract cognitive processes for the case of infant (Fig. 2A) 3,4 - 
are partitioned into clusters ('cells' or 'units') where a cluster's com- 
ponents do not have to be spatially close, just coordinated in some 
way (e.g. by phone). In short, network connections indicate empirical 
evidence of some coordinated activity, not spatial proximity. 

Clusters can begin to coordinate together over time (i.e. clusters 
coalesce) 9,20-22,29 but can also lose internal coordination (cluster 
fragments) under conditions of external or internal stress 9,20-22,29, just as a 
cluster of animals disperses if in danger or a start-up company dis- 
solves if it loses common purpose 36 . Adding the empirical finding 
that larger social clusters show more churn than smaller ones 29 , yields 
the simplest form of our dynamical cluster theory whose exact 
solution (see SI) is a Red cluster-size distribution of the form $Ms^{-\alpha}$ with $\alpha$ 
= 2.5, consistent with Fig. 1D inset, with gang sizes in Asia and 
Chicago ($\alpha$ = 2.3) 12 and with cyber-crowds of traders through the 
proxy of trade size ($\alpha$ = 2.5) 31. Following recent empirical findings 
linking size to lethality 14,18, we take a cluster's size as proportional to 
the severity of an event in which it participates, hence reproducing 
the severity distribution $Ms^{-\alpha}$ with $\alpha$ = 2.5. We explored many 
generalizations of this theory but find that $Ms^{-\alpha}$ with $\alpha \approx 2.5$ is 
Figure 3 | Event-timing benchmark focusing on violent confrontations. 

For a given symbol, each data-point shows the ($\tau_1$, $\beta$) values obtained from 
fitting trend in inter-event times (upper inset) within a confrontation in a 
unique region or city within a given country, mostly in Africa but also 
including Middle East and South America. SI contains key to symbols. 
Several best-fit lines are shown as a guide. Separate symbols for attacks 
against government forces, and against civilians. Red star shows result for 
global terrorism. Upper inset shows escalation of Red attacks in Belfast. 
Lower inset shows Belfast (solid red square) is abnormal compared to 
Armagh and Down (red squares with yellow centers). 

remarkably robust (see SI). Changing the rigidity of larger Red 
clusters successively from more rigid to less rigid moves the $\alpha$ values 
from below 2 to above 3, hence providing an interpretation for 
individual confrontations in Figs. 1A-C. Restricting connectivity 
between Red clusters to physical contact on a two-dimensional grid, 
like an urban street setting or battlefield, pushes $\alpha$ toward 1.9 with a 
weaker power-law ($p \to 0$), hence explaining most of the 
conventional wars in Fig. 1D and the $\alpha$ = 2.0 value for Chicago strikes 10. 
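
The coalescence-fragmentation picture above can be explored numerically. The sketch below is one simple variant, not necessarily the exact rules given in the SI. It assumes that at each step a random agent is chosen (so a cluster is selected with probability proportional to its size, giving larger clusters more churn), and that with a small probability `nu_frag` the agent's whole cluster shatters into singletons; otherwise the cluster merges with that of a second randomly chosen agent. All names and parameter values are illustrative assumptions.

```python
import random
from collections import Counter


def simulate_clusters(n_agents, nu_frag, steps, seed=0):
    """Simulate a simple coalescence-fragmentation process; return cluster sizes."""
    rng = random.Random(seed)
    label = list(range(n_agents))                # cluster label of each agent
    members = {i: [i] for i in range(n_agents)}  # label -> list of agents
    for _ in range(steps):
        a = rng.randrange(n_agents)              # selects a cluster with prob ~ size
        ca = label[a]
        if rng.random() < nu_frag:
            # Fragmentation: the whole cluster breaks into isolated agents.
            for agent in members.pop(ca):
                label[agent] = agent
                members[agent] = [agent]
        else:
            # Coalescence: merge with the cluster of a second random agent.
            cb = label[rng.randrange(n_agents)]
            if ca != cb:
                if len(members[ca]) < len(members[cb]):
                    ca, cb = cb, ca              # relabel the smaller cluster
                for agent in members[cb]:
                    label[agent] = ca
                members[ca].extend(members.pop(cb))
    return [len(m) for m in members.values()]


sizes = simulate_clusters(n_agents=2000, nu_frag=0.01, steps=50000)
histogram = Counter(sizes)  # cluster size -> number of clusters of that size
print(sorted(histogram.items())[:5])
```

Averaging the size histogram over many snapshots and plotting it on log-log axes gives a distribution whose tail can be compared against the $\alpha$ = 2.5 benchmark; this sketch does not itself verify that exponent.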

The notion that Red's self-organized, decentralized cluster 
structure (Fig. 1D inset) helps it adapt faster and/or better than Blue, is 
consistent with recent findings that organic structures are more con- 
ducive to innovation than bureaucratic ones 36 . Indeed, ultrafast tra- 
ders (Red, Fig. 2D) carry out their attacks in under a second. We 
introduce x(n) to represent Red's relative advantage over Blue 
following the last (n'th) attack, where x(n) follows a general stochastic 
process. For simplicity, we set the instantaneous rate of Red attacks as 
proportional to x(n) when x(n) > 0 (i.e. when Red has a relative 
advantage) and zero when x(n) < 0 (i.e. when Blue has a relative 
advantage) though this can be generalized. The rate of Red attacks in 
a confrontation that is generally escalating then scales as 
$x(n)|_{rms} \propto n^{\beta'}$ (see SI) where $\beta'$ characterizes the correlations in x(n) ($\beta'$ = 0.5 
for an uncorrelated process). The time between attacks, which is 
approximately the inverse rate, is therefore proportional to $n^{-\beta'}$, 
enabling us to identify $\beta' = \beta$. This explains why $\tau_1 n^{-\beta}$ describes 
the attack timings and implies that if $\beta > 0.5$, Red's lead over Blue 
follows a positively correlated process, while it follows a negatively 
correlated one if $0 < \beta < 0.5$. Confrontations that de-escalate (i.e. 
$\beta < 0$) can be treated similarly. Our theory then reproduces the linear 
dependence between $\beta$ and log $\tau_1$ if we introduce coupling between 



the underlying x(n) processes. Such coupling could arise if the same 
Red entity underlies attacks in different places, e.g. in Fig. 2B the 
same social movement underlies protests in different locations. 
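
The statement that an uncorrelated process gives $\beta'$ = 0.5 can be checked numerically. The sketch below treats x(n) as a simple unbiased random walk with steps of +1 or -1, which is an illustrative assumption rather than the general stochastic process of the SI, and estimates the exponent from the growth of the root-mean-square displacement with n.

```python
import numpy as np


def rms_growth_exponent(n_walks=2000, n_steps=1000, seed=0):
    """Estimate beta' in x(n)|rms ~ n**beta' for uncorrelated +/-1 steps."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1.0, 1.0], size=(n_walks, n_steps))
    x = np.cumsum(steps, axis=1)              # x(n) for each walk
    rms = np.sqrt(np.mean(x ** 2, axis=0))    # ensemble rms at each n
    n = np.arange(1, n_steps + 1, dtype=float)
    slope, _ = np.polyfit(np.log(n), np.log(rms), 1)
    return slope


beta_prime = rms_growth_exponent()
print(f"estimated beta' = {beta_prime:.2f}")
```

Introducing positive or negative correlations between successive steps would move the estimated exponent above or below 0.5, corresponding to the two regimes distinguished in the text.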

Figures 1-3 reveal surprising dynamical equivalences between 
confrontations and hence offer novel data proxies and cross-domain 
insights: The escalation of events in Magdalena, Colombia (black 
oval ring) is representative of all confrontations in Fig. 3; the relative 
position of General Electric (GE) in Fig. 2D makes predatory trade 
attacks on it akin to cyber-attacks on the Hi-tech Electronics sector 
(Fig. 2C) which in turn mimic specific infant-parent dyads (Fig. 2A) 
and protest locations (Fig. 2B); and the conflict in Sierra Leone, 
Africa, has the same (p, $\alpha$) in Fig. 1 as the narco-guerilla war in 
Antioquia, Colombia. Deviations from the benchmark behavior act 
as a novel alert mechanism for abnormalities in Red and/or Blue 
behavior, e.g. Angola in Fig. 1A, which serves to warn researchers 
against using such a confrontation as representative. The time-inter- 
val abnormality in Fig. 3 (upper inset) turns out to straddle the 
'Bloody Sunday' attack by Blue on civilians on 30 January 1972, 
implying that neighboring points offer insight into the build-up to, 
and consequences of, an extreme Blue intervention. Interestingly 
Bloody Sunday appears as the culmination of escalating PIRA 
attacks, not their trigger, hence raising new questions about its 
strategic importance. The fact that Belfast's ($\tau_1$, $\beta$) values in Fig. 3 (lower 
inset) destroy any linear dependence, is consistent with the recent 
fieldwork finding 20 that Belfast's PIRA network is quite distinct from 
Fig. 1D inset. The fact that sexual attacks against women do not 
appear as an outlier in Fig. 1, hints at some hidden clustering (like 
Fig. 1D inset) of attackers or attacks. 

We have shown that both the severities and the timings of events 
in a wide range of systems, follow a power-law functional form. 
There are various practical prediction tools and policies that follow 
from our work, as we now discuss. Suppose some sporadic attacks 
have been observed in a given location or sector in the real or online 
world. If the trend in successive time-intervals between attacks is 
found to follow $\tau_1 n^{-\beta}$, this suggests a single Red-Blue process 
(x(n)) underlies them. Assuming Red dominates the Red-Blue 
dynamic x(n) (i.e. Blue has not yet counter-adapted), this points to 
a single attacking Red individual or group. If attacks then emerge in 
different locations or sectors, detecting an approximate linear 
relationship in $\beta$ vs. log $\tau_1$ points to this same Red operating in these 
different places. Figure 2C hence supports media speculation that 
current cyber-attacks against different sectors of US infrastructure 
come from a single Red entity 24 . Likewise, Fig. 2D suggests that a 
common set of predatory algorithms and/or trading firms (Red) may 
underlie recent 'flash' instabilities in different stocks 25 . For Fig. 2A, 
the independence of the participants suggests that this linear pattern 
is revealing a new innate feature of how infants and parents interact. 

Now imagine the scenario in which two attacks occur in a new 
location that was previously quiet, and that this same Red is 
suspected. An estimate for $\beta$ in this new location can be read off from the 
existing $\beta$ vs. log $\tau_1$ plot by inputting this single inter-event time as an 
estimate for $\tau_1$. Future attack times can then be estimated using $\tau_1 n^{-\beta}$ 
(see SI for examples). 
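
The two-step estimate described above can be sketched in code. The snippet assumes that the existing $\beta$ vs. log $\tau_1$ plot has already been summarized by a straight line with a given `slope` and `intercept` (hypothetical numbers here, with base-10 logarithms assumed), and that $\tau_1$ is the single observed interval between the first two attacks.

```python
import math


def predict_attack_times(tau_1, slope, intercept, n_future, t_second_attack=0.0):
    """Estimate beta from the benchmark line, then project future attack times.

    slope and intercept describe an assumed linear fit beta = intercept +
    slope * log10(tau_1). Returns (beta, times), where times are measured on
    the same clock as t_second_attack.
    """
    beta = intercept + slope * math.log10(tau_1)
    times = []
    t = t_second_attack
    # tau_n is the gap between attack n and attack n + 1; tau_1 is already observed.
    for n in range(2, n_future + 2):
        t += tau_1 * n ** (-beta)
        times.append(t)
    return beta, times


# Illustrative values only: first gap of 30 days, hypothetical benchmark line.
beta, times = predict_attack_times(tau_1=30.0, slope=0.5, intercept=-0.3, n_future=5)
print(f"beta = {beta:.2f}")
print([round(t, 1) for t in times])
```

Because $\tau_1$ here rests on a single observed interval, the projected times are best read as a rough guide that should be updated as further attacks are recorded.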

Next consider the severities of events as they begin to emerge in a 
given sector. Suppose a crude $Ms^{-\alpha}$ distribution is found with $\alpha \approx 2.5$ 
and p > 0.05. This points to Red having a similar delocalized cluster 
structure to our model. Indeed, even without any observed events 
and hence without any event severities from which to estimate the 
distribution, the weight of evidence in Fig. 1 suggests that any future 
confrontation involving a similarly structured Red will produce a 
severity distribution $Ms^{-\alpha}$ with $\alpha \approx 2.5$ and p > 0.05. The expected 
number of victims in a future attack is therefore approximately 
$[(\alpha - 1)/(\alpha - 2)] s_{min}$, where $s_{min}$ is the cut-off in the maximum-likelihood 
fit 37. Taking $\alpha \approx 2.5$ as in Fig. 1 and $s_{min} \approx 1$, this expected number is 
3, which happens to coincide with the recent Boston marathon 
attack. The probability that the next attack will be at least twice as lethal is 
$(s/s_{min})^{1-\alpha} \approx (s/s_{min})^{-1.5}$ with s = 6, giving 0.07 (i.e. 7%). The severity 
of the most fatal attack will grow as the number of attacks n grows, 
following $n^{1/(\alpha-1)} \approx n^{0.67}$. Dividing attacks equally into less violent and 
more violent, the fraction of victims falling in the most violent half is 
given by $2^{-(\alpha-2)/(\alpha-1)} \approx 0.8$, meaning that a few attacks will produce 
the majority of the victims. Another relevant consequence of our 
clustering theory is that the ongoing coalescence-fragmentation pro- 
cess means that a lone wolf actor is only truly alone for short periods 
of time, which is again consistent with recent field studies 22 , and 
provides an estimate for how long ago contact was made with other 
Red clusters. 
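
The severity estimates quoted above follow directly from the power-law form and can be reproduced with a few lines of arithmetic. The helper below is a minimal sketch with hypothetical names; it assumes a continuous power law with $\alpha$ > 2 so that the mean is finite.

```python
def severity_forecasts(alpha=2.5, s_min=1.0):
    """Evaluate the closed-form severity estimates for a power-law tail."""
    if alpha <= 2.0:
        raise ValueError("alpha must exceed 2 for a finite expected severity")
    expected = (alpha - 1.0) / (alpha - 2.0) * s_min
    # Probability that an attack is at least twice the expected severity.
    p_double = (2.0 * expected / s_min) ** (1.0 - alpha)
    # Exponent governing growth of the largest attack with number of attacks n.
    max_growth_exponent = 1.0 / (alpha - 1.0)
    # Share of all victims accounted for by the most violent half of attacks.
    top_half_share = 2.0 ** (-(alpha - 2.0) / (alpha - 1.0))
    return {
        "expected": expected,
        "p_double": p_double,
        "max_growth_exponent": max_growth_exponent,
        "top_half_share": top_half_share,
    }


forecasts = severity_forecasts(alpha=2.5, s_min=1.0)
for name, value in forecasts.items():
    print(f"{name}: {value:.3f}")
```

Varying `alpha` within the range seen in Fig. 1 shows how sensitive these estimates are to the exponent, particularly the expected severity as $\alpha$ approaches 2.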

The stability of a in Fig. 1 throughout a given confrontation (see 
SI) suggests that the corresponding Red group self-organizes rapidly 
after its inception, as confirmed by our model's dynamics, and hence 
latent Red groups that have not yet launched any attacks may already 
have a structure resembling Fig. 1D inset. More generally, although 
the overall command structure of a present or future Red might be 
hierarchical, or publicly portrayed as so, our theory aligns with recent 
empirical findings 9,20,21,23 in predicting that operationally Red will 
self-organize into a far flatter, clustered structure similar to Fig. 1D 
inset. As a corollary, our clustering theory also identifies a novel 
'Achilles Heel' for such a Red: The self-organized nature of the clus- 
tering means that Blue can avoid having to find and destroy the 
largest (i.e. most lethal) Red clusters, by instead regularly breaking 
up smaller (i.e. less powerful) ones. The mathematics specifies con- 
ditions required to keep formation of large (lethal) clusters below a 
desired rate, and so reduce the threat level of large future attacks. It 
also warns that if Blue is insufficiently active in counter-measures, 
and hence the overall rate at which it fragments Red clusters becomes 
too small ($\nu_{frag} \ll (N \ln N)^{-1}$, where N is an estimate of Red's size 38), 
then Red will grow exponentially fast into one super-cluster of 
maximum possible lethality. Finally, our clustering theory predicts a 
necessary condition 39, $\nu_{frag}\, p / (\nu_{coal}\, q) > 1$, that must be met before a 
covert message or doctrine can spread within Red, where $\nu_{coal}$ is 
the cluster coalescence rate, p is the transmission rate of the message 
between two people in a Red cluster, and q is the rate at which this 
message gets forgotten or corrupted. 

Methods 

The power-law analysis that we use to obtain our results in Fig. 1 for the tail in the 
distribution of the severity of individual events, follows exactly the state-of-the-art 
testing procedure described in Refs. 16-18 and firmly established in Ref. 40. Our 
analysis of the trend in the timings of individual events follows exactly the method 
presented in Ref. 19. Our proposed model of cluster coalescence-fragmentation for 
Red, which reproduces the 2.5 power-law result for the severity distribution, com- 
prises a population of objects (agents) that self-organize into clusters according to the 
stated rules of cluster coalescence and fragmentation. The mathematical derivation of 
the 2.5 result is given in the SI. We have also investigated many variants of this 
coalescence-fragmentation cluster model and found (see SI) that most retain a power- 
law with exponent near 2.5. For the trend in the timings of attacks, a null model 
comparison showing the statistical significance of our benchmark result is given in 
the SI, together with the derivation of our stochastic model for Red's relative 
advantage over Blue which reproduces this timings benchmark. 